Spearman's Rank Difference formula

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} \tag{1}$$

is useful in obtaining a value of $\rho$, approximating r, from ranked
scores and from data which are originally in the form of ranks.
If the original data are ranks and there are no ties in rank, the
formula yields r. Spearman's formula may be conveniently derived
from the Pearson formula for correlation when the sigmas of
the two distributions and the sigma of the distribution of the
differences of the scores are known. This formula is

$$r_{xy} = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_{(x-y)}^2}{2\sigma_x\sigma_y} \tag{2}$$
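As an illustration of formula (1), the following Python sketch computes $\rho$ from two sets of untied ranks. The function name `rho_rank_difference` and the five-case data are hypothetical, chosen only for the example.

```python
def rho_rank_difference(ranks_1, ranks_2):
    """Spearman's rho from formula (1): 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    if len(ranks_1) != len(ranks_2):
        raise ValueError("both sets of ranks must have the same length")
    n = len(ranks_1)
    # D is the difference between the paired ranks of each case.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for five cases (no ties).
set_1 = [1, 2, 3, 4, 5]
set_2 = [2, 1, 4, 3, 5]
print(rho_rank_difference(set_1, set_2))  # sum(D^2) = 4, so rho = 1 - 24/120 = 0.8
```

Identical rankings give 1 and exactly reversed rankings give -1, as expected of a correlation coefficient.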





Other formulas for $\rho$ are easily derived.
The cross-product formula for correlation using raw scores is

$$r_{xy} = \frac{\dfrac{\sum XY}{N} - M_xM_y}{\sigma_x\sigma_y} \tag{3}$$

If ranks are correlated, means and sigmas of the two distributions
are equal. The mean of a continuous series of integers beginning
with unity is $\frac{N+1}{2}$, and the standard deviation is
$\sqrt{\frac{N^2-1}{12}}$. Substituting these values in (3) and
denoting the sum of the products of the pairs of ranks by
$\sum R_1R_2$, we have

$$\rho = \frac{\dfrac{\sum R_1R_2}{N} - \dfrac{(N+1)^2}{4}}{\dfrac{N^2-1}{12}} \tag{4}$$

which may be reduced to

$$\rho = \frac{12\sum R_1R_2 - 3N(N+1)^2}{N(N^2-1)} \tag{5}$$

$$\rho = \frac{\sum R_1R_2}{\dfrac{N(N^2-1)}{12}} - \frac{3(N+1)}{N-1} \tag{6}$$

If we call $\frac{N(N^2-1)}{12}$ Conversion Factor 1, or $c_1$, and
$\frac{3(N+1)}{N-1}$ Conversion Factor 2, or $c_2$, this Rank Product
formula becomes

$$\rho = \frac{\sum R_1R_2}{c_1} - c_2 \tag{7}$$

and values of the two conversion factors for any value of N from
5 to 50 can be read from Table I.

*Recommended by Dr. J. P. Guilford, January 16, 1939.
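The Rank Product formula (7) can be sketched in Python as follows. The function names and the sample ranks are hypothetical; the conversion factors are computed from their definitions rather than read from the table.

```python
def conversion_factors_product(n):
    """Return (c1, c2) as defined for the Rank Product formula (7)."""
    c1 = n * (n ** 2 - 1) / 12
    c2 = 3 * (n + 1) / (n - 1)
    return c1, c2


def rho_rank_product(ranks_1, ranks_2):
    """Rho from formula (7): sum(R1*R2) / c1 - c2."""
    n = len(ranks_1)
    c1, c2 = conversion_factors_product(n)
    sum_products = sum(a * b for a, b in zip(ranks_1, ranks_2))
    return sum_products / c1 - c2


print(conversion_factors_product(5))  # (10.0, 4.5), the N = 5 entries for c1 and c2

# Hypothetical untied ranks for five cases; sum(R1*R2) = 53.
set_1 = [1, 2, 3, 4, 5]
set_2 = [2, 1, 4, 3, 5]
print(rho_rank_product(set_1, set_2))  # approximately 0.8 (53/10 - 4.5)
```

Only the sum of products has to be accumulated per pair of cases, which is what makes this form convenient for machine work.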


Another formula for $\rho$ can be derived from Kelley's formula
for r when the standard deviations of the two distributions and
the standard deviation of the distribution of the sums of the pairs
of scores are known. This formula reads:

$$r_{xy} = \frac{\sigma_{(x+y)}^2 - \sigma_x^2 - \sigma_y^2}{2\sigma_x\sigma_y} \tag{8}$$



The standard deviation of the paired sums of the ranks is

$$\sigma_s = \sqrt{\frac{\sum S^2}{N} - \left(\frac{\sum S}{N}\right)^2} \tag{9}$$

in which $\sum S^2$ is the summation of the squares of the sums of the
ranks and $\sum S$ is the summation of the sums. The sum of a
continuous series of integers beginning with unity is, however,
$\frac{N(N+1)}{2}$, and the sum of two such series, or $\sum S$, is
$N(N+1)$. Substituting in (9) we have

$$\sigma_s = \sqrt{\frac{\sum S^2}{N} - (N+1)^2} \tag{10}$$

$$\sigma_s^2 = \frac{\sum S^2 - N^3 - 2N^2 - N}{N} \tag{11}$$

The substitution of values for $\sigma_s^2$ and the sigmas of the
original ranks in (8) gives

$$\rho = \frac{\dfrac{\sum S^2 - N^3 - 2N^2 - N}{N} - \dfrac{N^2-1}{6}}{\dfrac{N^2-1}{6}} \tag{12}$$

which may be reduced to

$$\rho = \frac{6\sum S^2 - N(N+1)(7N+5)}{N(N^2-1)} \tag{13}$$

$$\rho = \frac{\sum S^2}{\dfrac{N(N^2-1)}{6}} - \frac{7N+5}{N-1} \tag{14}$$

If $\frac{N(N^2-1)}{6}$ is considered the third Conversion Factor, or
$c_3$, and $\frac{7N+5}{N-1}$ the fourth Conversion Factor, or $c_4$,
this Rank Sum formula becomes

$$\rho = \frac{\sum S^2}{c_3} - c_4 \tag{15}$$

and values for $c_3$ and $c_4$ can be read from Table I. The
Conversion Factor, $c_3$, is also useful in working with Spearman's
formula (1), which becomes

$$\rho = 1 - \frac{\sum D^2}{c_3} \tag{16}$$

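A minimal Python sketch of the Rank Sum formula (15) and of Spearman's formula in the form (16) follows. The function names and the sample ranks are hypothetical and serve only to show that the two forms give the same value on untied ranks.

```python
def conversion_factors_sum(n):
    """Return (c3, c4) as defined for the Rank Sum formula (15)."""
    c3 = n * (n ** 2 - 1) / 6
    c4 = (7 * n + 5) / (n - 1)
    return c3, c4


def rho_rank_sum(ranks_1, ranks_2):
    """Rho from formula (15): sum(S^2) / c3 - c4, where S = R1 + R2."""
    c3, c4 = conversion_factors_sum(len(ranks_1))
    sum_s2 = sum((a + b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return sum_s2 / c3 - c4


def rho_rank_difference_c3(ranks_1, ranks_2):
    """Rho from formula (16): 1 - sum(D^2) / c3, where D = R1 - R2."""
    c3, _ = conversion_factors_sum(len(ranks_1))
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - sum_d2 / c3


# Hypothetical untied ranks: sum(S^2) = 216 and sum(D^2) = 4, with c3 = 20, c4 = 10.
set_1 = [1, 2, 3, 4, 5]
set_2 = [2, 1, 4, 3, 5]
print(rho_rank_sum(set_1, set_2))            # approximately 0.8
print(rho_rank_difference_c3(set_1, set_2))  # approximately 0.8
```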
TABLE I
CONVERSION FACTORS FOR VALUES OF N FROM 5 TO 50
(Note: Terminal 5's which are underlined should be dropped if values
are read to three places of decimals.)

 N        c1       c2      c3       c4
 5        10      4.5      20       10
 6      17.5      4.2      35      9.4
 7        28        4      56        9
 8        42   3.8571      84   8.7143
 9        60     3.75     120      8.5
10      82.5   3.6667     165   8.3333
11       110      3.6     220      8.2
12       143   3.5455     286   8.0909
13       182      3.5     364        8
14     227.5   3.4615     455   7.9231
15       280   3.4286     560   7.8571
16       340      3.4     680      7.8
17       408    3.375     816     7.75
18     484.5   3.3529     969   7.7059
19       570   3.3333    1140   7.6667
20       665   3.3158    1330   7.6316
21       770      3.3    1540      7.6
22     885.5   3.2857    1771   7.5714
23      1012   3.2727    2024   7.5455
24      1150   3.2609    2300   7.5217
25      1300     3.25    2600      7.5
26    1462.5     3.24    2925     7.48
27      1638   3.2308    3276   7.4615
28      1827   3.2222    3654   7.4444
29      2030   3.2143    4060   7.4286
30    2247.5   3.2069    4495   7.4138
31      2480      3.2    4960      7.4
32      2728   3.1935    5456   7.3871
33      2992   3.1875    5984    7.375
34    3272.5   3.1818    6545   7.3636
35      3570   3.1765    7140   7.3529
36      3885   3.1714    7770   7.3429
37      4218   3.1667    8436   7.3333
38    4569.5   3.1622    9139   7.3243
39      4940   3.1579    9880   7.3158
40      5330   3.1538   10660   7.3077
41      5740     3.15   11480      7.3
42    6170.5   3.1463   12341   7.2927
43      6622   3.1429   13244   7.2857
44      7095   3.1395   14190   7.2791
45      7590   3.1364   15180   7.2727
46    8107.5   3.1333   16215   7.2667
47      8648   3.1304   17296   7.2609
48      9212   3.1277   18424   7.2553
49      9800    3.125   19600     7.25
50   10412.5   3.1224   20825   7.2449


CHECK FORMULAS

Since all three basic formulas for $\rho$, (1), (5) and (13), are
derived with the same assumptions from variations of the Pearson
formula for r, they can be taken as equivalent. Several check
equations are possible. By equating the value for $\rho$ in (1) with
the value for $\rho$ in (13) we obtain


$$\sum S^2 + \sum D^2 = \frac{2N(2N+1)(N+1)}{3} \tag{17}$$

Similarly from (5) and (13) we have

$$\sum S^2 - 2\sum R_1R_2 = \frac{N(2N+1)(N+1)}{3} \tag{18}$$

and from (1) and (5)

$$2\sum R_1R_2 + \sum D^2 = \frac{N(2N+1)(N+1)}{3} \tag{19}$$

From (18) and (19) a fourth check formula may be readily obtained:

$$\sum S^2 - 4\sum R_1R_2 - \sum D^2 = 0 \tag{20}$$

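The check formulas may be verified numerically. The following is a minimal Python sketch, assuming untied integer ranks; the helper names and the shuffled data are illustrative only. Both sides of (17) to (19) are multiplied by 3 so that the comparison stays in exact integer arithmetic.

```python
import random


def rank_sums(ranks_1, ranks_2):
    """Return (sum S^2, sum R1*R2, sum D^2) for two sets of ranks."""
    sum_s2 = sum((a + b) ** 2 for a, b in zip(ranks_1, ranks_2))
    sum_r1r2 = sum(a * b for a, b in zip(ranks_1, ranks_2))
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return sum_s2, sum_r1r2, sum_d2


def check_formulas_hold(ranks_1, ranks_2):
    """True when check formulas (17), (18), (19) and (20) are all satisfied."""
    n = len(ranks_1)
    sum_s2, sum_r1r2, sum_d2 = rank_sums(ranks_1, ranks_2)
    target = n * (2 * n + 1) * (n + 1)  # three times the right side of (18) and (19)
    return (
        3 * (sum_s2 + sum_d2) == 2 * target        # (17)
        and 3 * (sum_s2 - 2 * sum_r1r2) == target  # (18)
        and 3 * (2 * sum_r1r2 + sum_d2) == target  # (19)
        and sum_s2 - 4 * sum_r1r2 - sum_d2 == 0    # (20)
    )


rng = random.Random(0)      # fixed seed, so the illustration is repeatable
set_1 = list(range(1, 11))  # ranks 1..10 in order
set_2 = set_1[:]
rng.shuffle(set_2)          # the same ranks in an arbitrary order
print(check_formulas_hold(set_1, set_2))  # True for any untied ranking
```

Formula (20) is an algebraic identity and holds for any pairs of numbers, whereas (17) to (19) depend on the sums of squares of the ranks and so fail when ties alter those sums.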
COMBINATION FORMULAS FOR $\rho$

Combination formulas for $\rho$, involving various combinations of
$\sum S^2$, $\sum D^2$ and $\sum R_1R_2$ may have some interest. For
the sake of brevity, algebraic steps in the derivations are omitted.

From Kelley's sum and difference formula for r, based upon
deviations from means in case of equal variability

$$r = \frac{\sigma_s^2 - \sigma_d^2}{\sigma_s^2 + \sigma_d^2} \tag{21}$$

we have

$$\rho = \frac{\sum S^2 - \sum D^2 - N(N+1)^2}{\sum S^2 + \sum D^2 - N(N+1)^2} \tag{22}$$

Substituting the value for $\sum S^2 + \sum D^2$ given in the check
formula (17) we have another combination formula:

$$\rho = \frac{3\left[\sum S^2 - \sum D^2 - N(N+1)^2\right]}{N(N^2-1)} \tag{23}$$

This formula can also be obtained by adding (1) and (13) and
solving for $\rho$.

Adding (1) and (5) and solving for $\rho$ yields

$$\rho = \frac{6\sum R_1R_2 - 3\sum D^2 - N(N+1)(N+2)}{N(N^2-1)} \tag{24}$$

Similarly from (5) and (13) we have

$$\rho = \frac{6\sum R_1R_2 + 3\sum S^2 - N(5N+4)(N+1)}{N(N^2-1)} \tag{25}$$

Adding (1), (5) and (13) and solving for $\rho$ yields

$$\rho = \frac{2\left(2\sum R_1R_2 + \sum S^2 - \sum D^2\right) - 3N(N+1)^2}{N(N^2-1)} \tag{26}$$

By adding (1) and (5) and subtracting (13) we obtain

$$\rho = \frac{6\left(2\sum R_1R_2 - \sum S^2 - \sum D^2\right) + N(5N+1)(N+1)}{N(N^2-1)} \tag{27}$$

Adding (1) and (13) and subtracting (5) gives

$$\rho = \frac{6\left(\sum S^2 - \sum D^2 - 2\sum R_1R_2\right) - 3N(N+1)^2}{N(N^2-1)} \tag{28}$$

Adding (5) and (13) and subtracting (1) gives

$$\rho = \frac{6\left(\sum S^2 + 2\sum R_1R_2 + \sum D^2\right) - N(11N+7)(N+1)}{N(N^2-1)} \tag{29}$$

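As a numerical illustration, the Python sketch below evaluates combination formulas (23) to (29) on one pair of untied rankings. The function name, the dictionary keyed by formula number, and the five-case data are assumptions made for the example.

```python
def combination_rhos(ranks_1, ranks_2):
    """Evaluate combination formulas (23)-(29); keys are the formula numbers."""
    n = len(ranks_1)
    s2 = sum((a + b) ** 2 for a, b in zip(ranks_1, ranks_2))  # sum S^2
    rr = sum(a * b for a, b in zip(ranks_1, ranks_2))         # sum R1*R2
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))  # sum D^2
    denom = n * (n ** 2 - 1)
    return {
        23: 3 * (s2 - d2 - n * (n + 1) ** 2) / denom,
        24: (6 * rr - 3 * d2 - n * (n + 1) * (n + 2)) / denom,
        25: (6 * rr + 3 * s2 - n * (5 * n + 4) * (n + 1)) / denom,
        26: (2 * (2 * rr + s2 - d2) - 3 * n * (n + 1) ** 2) / denom,
        27: (6 * (2 * rr - s2 - d2) + n * (5 * n + 1) * (n + 1)) / denom,
        28: (6 * (s2 - d2 - 2 * rr) - 3 * n * (n + 1) ** 2) / denom,
        29: (6 * (s2 + 2 * rr + d2) - n * (11 * n + 7) * (n + 1)) / denom,
    }


# Hypothetical untied ranks: sum S^2 = 216, sum R1*R2 = 53, sum D^2 = 4.
set_1 = [1, 2, 3, 4, 5]
set_2 = [2, 1, 4, 3, 5]
print(combination_rhos(set_1, set_2))  # every numerator is 96, so each value is 0.8
```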


TREATMENT OF TIES 


When data originally in the form of ranks are correlated, 
Spearman’s formula and the formulas proposed above yield r 
when there are no ties. Two methods of treating ties have been 
used, the more usual of which is the Mid-Rank method. In this 
the vacant ranks are summed and divided by the number of cases 
participating in the tie. For example, if after two ranks are filled 
there are three cases tied for next place, each is assigned the rank 
of 4. The bracket method assigns to all ties the rank of the first 
vacancy, in this case 3. In both systems the last rank is N, unless 
the distribution ends in a tie. No mathematical proof has been 
offered for either method. We shall now derive a formula for a 
third method of treating ties. 
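The two customary treatments can be sketched in Python as follows. The function `tied_ranks` is hypothetical, and it assumes that rank 1 goes to the smallest score, a convention the text does not specify.

```python
def tied_ranks(scores, method="mid"):
    """Rank scores in ascending order (rank 1 = smallest score).

    method="mid":     tied cases share the mean of the vacant ranks.
    method="bracket": tied cases share the first vacant rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over every case tied with the case at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        first_vacancy, last_vacancy = pos + 1, end + 1
        if method == "mid":
            shared = (first_vacancy + last_vacancy) / 2
        elif method == "bracket":
            shared = float(first_vacancy)
        else:
            raise ValueError("method must be 'mid' or 'bracket'")
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Hypothetical scores: two ranks are filled, then three cases tie for next place.
scores = [10, 20, 30, 30, 30, 40]
print(tied_ranks(scores, "mid"))      # [1.0, 2.0, 4.0, 4.0, 4.0, 6.0]
print(tied_ranks(scores, "bracket"))  # [1.0, 2.0, 3.0, 3.0, 3.0, 6.0]
```

In both cases the rank following the tie is 6, since the tied cases occupy vacancies 3, 4 and 5 whichever value they are assigned.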


Consider two sets of ranks to be correlated, of N terms each.
Set I consists of consecutive ranks as far as the $N_a$th term. Then
there are $N_t$ terms, each denoted by X, participating in a tie.
These are followed by $N_b$ terms, denoted by G, H, ..., J,
also arranged in order. The last case, J, has the rank of N.
Corresponding ranks in Set II are denoted by a, b, c, ..., j, and
while they include all ranks from 1 to N, they are not arranged
in order. The two sets, the squares of the sums of pairs and the
squares of the differences are given below:

$$\begin{array}{llll}
\text{Set I} & \text{Set II} & S^2 & D^2 \\
1 & a & 1 + 2a + a^2 & 1 - 2a + a^2 \\
2 & b & 4 + 4b + b^2 & 4 - 4b + b^2 \\
\vdots & \vdots & \vdots & \vdots \\
\text{(to } N_a \text{ terms)} & & & \\
X & \vdots & \vdots & \vdots \\
\text{(to } N_t \text{ terms)} & & & \\
G & \vdots & \vdots & \vdots \\
H & \vdots & \vdots & \vdots \\
\text{(to } N_b \text{ terms)} & & & \\
J & j & J^2 + 2Jj + j^2 & J^2 - 2Jj + j^2
\end{array}$$

$$N = N_a + N_t + N_b$$

In summing the $S^2$'s and the $D^2$'s together the middle terms
cancel and we have

$$\sum S^2 + \sum D^2 = 2(a^2 + b^2 + \cdots + j^2) + 2\left[(1^2 + 2^2 + \cdots + N_a^2) + N_tX^2 + (G^2 + H^2 + \cdots + J^2)\right] \tag{30}$$

Check formula (17) gives the value of $\sum S^2 + \sum D^2$ as

$$\frac{2N(2N+1)(N+1)}{3} \quad \text{or} \quad \frac{4N(2N+1)(N+1)}{6}$$

The sum of the squares of N integers beginning with unity is
$\frac{N(2N+1)(N+1)}{6}$, which is the value of
$(a^2 + b^2 + \cdots + j^2)$. We can therefore set up the equation

$$(1^2 + 2^2 + \cdots + N_a^2) + N_tX^2 + (G^2 + H^2 + \cdots + J^2) = \frac{N(2N+1)(N+1)}{6} \tag{31}$$

which shows that $N_tX^2$ is equal to the sum of the squares of the
$N_t$ vacant ranks, and X is their quadratic mean. A value for X
in terms of the mid-rank and number of vacancies can also be
found.
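As a worked illustration (the numbers are chosen here for the example), take the case mentioned earlier in which two ranks are filled and three cases tie for next place, so that the vacant ranks are 3, 4 and 5. The quadratic mean of the vacancies is then slightly larger than their mid-rank of 4:

```latex
N_t X^2 = 3^2 + 4^2 + 5^2 = 50,
\qquad
X = \sqrt{\frac{50}{3}} \approx 4.0825 .
```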
The first vacancy in Set I is $(N_a + 1)$, the second $(N_a + 2)$,
and so on down to the last $(N_a + N_t)$, where $N_a$ ranks precede
the tie and $N_t$ cases participate in it. The squares of the
vacancies are:

$$\begin{aligned}
(N_a + 1)^2 &= N_a^2 + 2N_a + 1 \\
(N_a + 2)^2 &= N_a^2 + 4N_a + 4 \\
&\;\;\vdots \\
(N_a + N_t)^2 &= N_a^2 + 2N_aN_t + N_t^2
\end{aligned}$$

Summing and equating to $N_tX^2$ we have

$$N_tX^2 = N_tN_a^2 + 2N_a(1 + 2 + 3 + \cdots + N_t) + (1^2 + 2^2 + 3^2 + \cdots + N_t^2) \tag{32}$$

$$N_tX^2 = N_tN_a^2 + \frac{2N_aN_t(N_t+1)}{2} + \frac{N_t(2N_t+1)(N_t+1)}{6} \tag{33}$$

$$X^2 = \frac{12N_a^2 + 12N_aN_t + 12N_a + 4N_t^2 + 6N_t + 2}{12} \tag{34}$$

We can simplify this expression in terms of the mid-rank or
M-R. The sum of $N_t$ consecutive integers beginning with $N_a + 1$
is $\frac{N_t(2N_a + N_t + 1)}{2}$. The mid-rank is therefore
$\frac{2N_a + N_t + 1}{2}$ and

$$(\text{M-R})^2 = \frac{12N_a^2 + 12N_aN_t + 12N_a + 3N_t^2 + 6N_t + 3}{12} \tag{35}$$

Substituting in (34) we have

$$X^2 = (\text{M-R})^2 + \frac{N_t^2 - 1}{12} \tag{36}$$

Accordingly, the proper value to assign tied ranks so that the
equivalent formulas for $\rho$ will agree is

$$R_c = \sqrt{(\text{M-R})^2 + \frac{N_t^2 - 1}{12}} \tag{37}$$

in which $R_c$ is the corrected rank, M-R is the mid-rank or
arithmetic mean of the vacancies and $N_t$ is the number of cases
participating in the tie. It is to be noted that
$\frac{N_t^2 - 1}{12}$ is the standard deviation squared of the
vacancies.

Since the formula (37) does not depend on the value of ranks
in the other series and involves position in the series only as it
affects the value of the mid-rank, it is a general formula. It can
be used with any number of cases participating in a tie, with any
number of ties in a series and in either or both of the variables.

Table II gives the values of $R_c$ for $N_t$'s from 2 to 7, inclusive.
Entry is by means of the first vacancy. For example, if after five
ranks have been assigned, there are four cases tied, the first
vacancy is 6, $N_t$ is 4 and $R_c$, to be assigned to all four cases,
is 7.5829.
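Formula (37) can be sketched in Python as below. The function names are hypothetical; the second function computes the quadratic mean of the vacancies directly, as a check that it agrees with the closed form.

```python
import math


def corrected_rank(first_vacancy, n_tied):
    """Corrected rank from formula (37) for a tie beginning at first_vacancy."""
    mid_rank = first_vacancy + (n_tied - 1) / 2
    return math.sqrt(mid_rank ** 2 + (n_tied ** 2 - 1) / 12)


def quadratic_mean_of_vacancies(first_vacancy, n_tied):
    """Square root of the mean of the squared vacant ranks."""
    vacancies = range(first_vacancy, first_vacancy + n_tied)
    return math.sqrt(sum(v * v for v in vacancies) / n_tied)


# The example from the text: four cases tied after five ranks have been assigned.
print(round(corrected_rank(6, 4), 4))               # 7.5829
print(round(quadratic_mean_of_vacancies(6, 4), 4))  # 7.5829
```

Computing the value this way extends the method to ties of more than seven cases or to first vacancies beyond 50, which lie outside the range tabled.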


TABLE II
VALUES OF $R_c$ FROM FORMULA (37)

(Note: Terminal 5's which are underlined should be dropped if values
are read to three places of decimals.)

Initial                          N_t
Vacancy        2        3        4        5        6        7
    1     1.5811   2.1602   2.7386   3.3166   3.8944   4.4721
    2     2.5495   3.1091   3.6742   4.2426   4.8132   5.3852
    3     3.5355   4.0825   4.6368   5.1962   5.7591   6.3246
    4     4.5277   5.0662   5.6125   6.1644   6.7206   7.2801
    5     5.5227   6.0553   6.5955   7.1414   7.6920   8.2462
    6     6.5192   7.0475   7.5829   8.1240   8.6699   9.2195
    7     7.5166   8.0416   8.5732   9.1104   9.6523  10.1980
    8     8.5147   9.0370   9.5656  10.0995  10.6380  11.1803
    9     9.5131  10.0333  10.5594  11.0905  11.6261  12.1655
   10    10.5119  11.0303  11.5542  12.0830  12.6161  13.1529
   11    11.5109  12.0277  12.5499  13.0767  13.6076  14.1421
   12    12.5100  13.0256  13.5462  14.0712  14.6002  15.1327
   13    13.5093  14.0238  14.5430  15.0665  15.5938  16.1245
   14    14.5086  15.0222  15.5403  16.0624  16.5881  17.1172
   15    15.5081  16.0208  16.5378  17.0587  17.5831  18.1108
   16    16.5076  17.0196  17.5357  18.0555  18.5787  19.1050
   17    17.5071  18.0185  18.5338  19.0526  19.5746  20.0998
   18    18.5068  19.0175  19.5320  20.0499  20.5710  21.0950
   19    19.5064  20.0167  20.5305  21.0476  21.5677  22.0907
   20    20.5061  21.0159  21.5291  22.0454  22.5647  23.0868
   21    21.5058  22.0151  22.5278  23.0434  23.5620  24.0832
   22    22.5056  23.0145  23.5266  24.0416  24.5595  25.0799
   23    23.5053  24.0139  24.5255  25.0400  25.5571  26.0768
   24    24.5051  25.0133  25.5245  26.0384  26.5550  27.0740
   25    25.5049  26.0128  26.5236  27.0370  27.5530  28.0713
   26    26.5047  27.0123  27.5227  28.0357  28.5511  29.0689
   27    27.5045  28.0119  28.5219  29.0345  29.5494  30.0666
   28    28.5044  29.0115  29.5212  30.0333  30.5478  31.0644
   29    29.5042  30.0111  30.5205  31.0322  31.5463  32.0624
   30    30.5041  31.0108  31.5198  32.0312  32.5448  33.0606
   31    31.5040  32.0104  32.5192  33.0303  33.5435  34.0588
   32    32.5038  33.0101  33.5187  34.0294  34.5422  35.0571
   33    33.5037  34.0098  34.5181  35.0286  35.5411  36.0555
   34    34.5036  35.0095  35.5176  36.0278  36.5399  37.0540
   35    35.5035  36.0093  36.5171  37.0270  37.5389  38.0526
   36    36.5034  37.0090  37.5167  38.0263  38.5379  39.0512
   37    37.5033  38.0088  38.5162  39.0256  39.5369  40.0500
   38    38.5032  39.0085  39.5158  40.0250  40.5360  41.0488
   39    39.5032  40.0083  40.5154  41.0244  41.5351  42.0476
   40    40.5031  41.0081  41.5151  42.0238  42.5343  43.0465
   41    41.5030  42.0079  42.5147  43.0232  43.5335  44.0454
   42    42.5029  43.0078  43.5144  44.0227  44.5328  45.0444
   43    43.5029  44.0076  44.5140  45.0222  45.5320  46.0435
   44    44.5028  45.0074  45.5137  46.0217  46.5314  47.0425
   45    45.5027  46.0072  46.5134  47.0213  47.5307  48.0416
   46    46.5027  47.0071  47.5132  48.0208  48.5301  49.0408
   47    47.5026  48.0069  48.5129  49.0204  49.5295  50.0400
   48    48.5026  49.0068  49.5126  50.0200  50.5289  51.0392
   49    49.5025  50.0067  50.5124  51.0196  51.5283  52.0384
   50    50.5025  51.0065  51.5121  52.0192  52.5278  53.0377


A NUMERICAL COMPARISON OF THE FORMULAS AND OF METHODS
OF TREATING TIES

A numerical example is now given, using hypothetical data. In
the sets of ranks given below ties are treated according to Formula
(37), but calculations have also been made using the bracket
method and the Mid-Rank method. In Variable I there are five cases
tied for second place, two for ninth place and three for twelfth
place. In Variable II there are four cases tied for first place.
Values of $\sum S^2$, $\sum R_1R_2$, $\sum D^2$ and $\rho$ are given.

Subj.    Var. I     Var. II
A         1          2.7386
B         4.2426    10
C         4.2426     2.7386
D         4.2426     5
E         4.2426     7
F         4.2426     2.7386
G         7          8
H         8          2.7386
I         9.5131     6
J         9.5131    12
K        11         11
L        13.0256    13
M        13.0256     9
N        13.0256    14

Method of
Treating Ties      $\sum S^2$    $\sum R_1R_2$    $\sum D^2$
Corrected Rank     3946.7540     958.3796         113.2358
Mid-Rank           3905.5        946.5            119.5
Bracket            3494          828              182

VALUES OF $\rho$
                              Method of Treating Ties
Formula                      C-R       M-R       Bracket
Rank Difference (1-16)      .7511     .7374      .6000
Rank Sum (13-15)            .7511     .6604     -.2440
Rank Product (5-7)          .7511     .6989      .1781



Only when there are no ties or when corrected ranks are used do
the check formulas (17, 18, 19 and 20) apply. Then all formulas
for $\rho$, including the combination formulas, yield identical results.
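The comparison can be reproduced with the Python sketch below. The integer places (the first vacant rank of each case) are inferred from the description of the ties and the tabled ranks, and the function names are hypothetical. Conversion factors are computed from their definitions rather than read from Table I, so the last decimal may differ slightly from the printed values.

```python
import math
from collections import Counter

# Place (first vacant rank) of subjects A..N on each variable, inferred from the example.
var_1 = [1, 2, 2, 2, 2, 2, 7, 8, 9, 9, 11, 12, 12, 12]
var_2 = [1, 10, 1, 5, 7, 1, 8, 1, 6, 12, 11, 13, 9, 14]


def assign_ranks(places, method):
    """Turn places into ranks by the bracket, mid-rank or corrected-rank method."""
    counts = Counter(places)
    ranks = []
    for first_vacancy in places:
        n_tied = counts[first_vacancy]
        mid_rank = first_vacancy + (n_tied - 1) / 2
        if method == "bracket":
            ranks.append(float(first_vacancy))
        elif method == "mid":
            ranks.append(mid_rank)
        elif method == "corrected":
            ranks.append(math.sqrt(mid_rank ** 2 + (n_tied ** 2 - 1) / 12))  # formula (37)
        else:
            raise ValueError("unknown method: " + method)
    return ranks


def three_rhos(ranks_1, ranks_2):
    """Return rho by Rank Difference (16), Rank Sum (15) and Rank Product (7)."""
    n = len(ranks_1)
    c1 = n * (n ** 2 - 1) / 12
    c2 = 3 * (n + 1) / (n - 1)
    c3 = n * (n ** 2 - 1) / 6
    c4 = (7 * n + 5) / (n - 1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    sum_s2 = sum((a + b) ** 2 for a, b in zip(ranks_1, ranks_2))
    sum_rr = sum(a * b for a, b in zip(ranks_1, ranks_2))
    return (1 - sum_d2 / c3, sum_s2 / c3 - c4, sum_rr / c1 - c2)


results = {}
for method in ("corrected", "mid", "bracket"):
    results[method] = three_rhos(assign_ranks(var_1, method), assign_ranks(var_2, method))
    print(method, [round(value, 4) for value in results[method]])
```

With corrected ranks the three formulas agree with one another, while the mid-rank and bracket methods give a different value for each formula, which is the pattern shown in the table of values of $\rho$.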


CHOICE OF METHODS


In the treatment of ties, the bracket method is worthless. The 
Mid-Rank method gives a fair approximation and is convenient in 
use. It may be recommended when scores are converted into ranks 
and only a rough estimate of r is required. For all exact work, 
using data originally in the form of ranks, the Corrected Rank 
method should be used in treating ties. 

Of the formulas proposed for $\rho$ only the three basic formulas
(1, 5 and 13, together with their variations) are worthy of actual
use. Spearman's Rank Difference formula is the most convenient
when work is done by hand, and when there are no ties, or when
the Mid-Rank method is employed. When the Corrected Rank
method is used and a calculating machine is available, the Rank
Product formula (5-7) is the most practical. It is also preferable
if large numbers of $\rho$'s are to be computed using Hollerith
equipment. The Rank Sum formula (13-15) can be used fairly
conveniently with Hollerith equipment and affords a good check
on the Product formula.








